Performance guarantees for hierarchical clustering
نویسندگان
چکیده
منابع مشابه
Performance Guarantees for Hierarchical Clustering
We show that for any data set in any metric space, it is possible to construct a hierarchical clustering with the guarantee that for every k, the induced k-clustering has cost at most eight times that of the optimal k-clustering. Here the cost of a clustering is taken to be the maximum radius of its clusters. Our algorithm is similar in simplicity and efficiency to common heuristics for hierarc...
متن کاملOn Randomly Projected Hierarchical Clustering with Guarantees
1 Hierarchical clustering (HC) algorithms are generally limited to small data instances due to their runtime costs. Here we mitigate this shortcoming and explore fast HC algorithms based on random projections for single (SLC) and average (ALC) linkage clustering as well as for the minimum spanning tree problem (MST). We present a thorough adaptive analysis of our algorithms that improve prior w...
متن کاملThe Case for Hierarchical Schedulers with Performance Guarantees
Audio and video applications, process control, agile manufacturing and even defense systems are using commodity hardware and operating systems to run combinations of real-time and non-real-time tasks. We propose an architecture that will allow a general-purpose operating system to schedule conventional and real-time tasks with diverse requirements, to provide flexible load isolation between app...
متن کاملHIERARCHICAL DATA CLUSTERING MODEL FOR ANALYZING PASSENGERS’ TRIP IN HIGHWAYS
One of the most important issues in urban planning is developing sustainable public transportation. The basic condition for this purpose is analyzing current condition especially based on data. Data mining is a set of new techniques that are beyond statistical data analyzing. Clustering techniques is a subset of it that one of it’s techniques used for analyzing passengers’ trip. The result of...
متن کاملImproving OLAP Performance by Multidimensional Hierarchical Clustering
Data-warehousing applications cope with enormous data sets in the range of Gigabytes and Terabytes. Queries usually either select a very small set of this data or perform aggregations on a fairly large data set. Materialized views storing pre-computed aggregates are used to efficiently process queries with aggregations. This approach increases resource requirements in disk space and slows down ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of Computer and System Sciences
سال: 2005
ISSN: 0022-0000
DOI: 10.1016/j.jcss.2004.10.006